-
Parkhill, Julian (Ed.)
ABSTRACT RNA transcripts are potential therapeutic targets, yet bacterial transcripts have uncharacterized biodiversity. We developed an algorithm for transcript prediction called tp.py, using it to predict transcripts (mRNA and other RNAs) in Escherichia coli K12 and E2348/69 strains (Bacteria:gamma-Proteobacteria), Listeria monocytogenes strains Scott A and RO15 (Bacteria:Firmicute), Pseudomonas aeruginosa strains SG17M and NN2 (Bacteria:gamma-Proteobacteria), and Haloferax volcanii (Archaea:Halobacteria). From >5 million E. coli K12 and >3 million E. coli E2348/69 newly generated Oxford Nanopore Technologies direct RNA sequencing reads, 2,487 K12 mRNAs and 1,844 E2348/69 mRNAs were predicted, with the K12 mRNAs containing more than half of the predicted E. coli K12 proteins. While the number of predicted transcripts varied by strain based on the amount of sequence data used, across all strains examined, the predicted average size of the mRNAs was 1.6–1.7 kbp, while the median size of the 5′- and 3′-untranslated regions (UTRs) was 30–90 bp. Given the lack of bacterial and archaeal transcript annotation, most predictions were of novel transcripts, but we also predicted many previously characterized mRNAs and ncRNAs, including post-transcriptionally generated transcripts and small RNAs associated with pathogenesis in the E. coli E2348/69 LEE pathogenicity islands. We predicted small transcripts in the 100–200 bp range as well as >10 kbp transcripts for all strains, with the longest transcript for two of the seven strains being the nuo operon transcript, and for another two strains it was a phage/prophage transcript. This quick, easy, and reproducible method will facilitate the presentation of transcripts and UTR predictions alongside coding sequences and protein predictions in bacterial genome annotation as important resources for the research community.
IMPORTANCE Our understanding of bacterial and archaeal genes and genomes is largely focused on proteins since there have only been limited efforts to describe bacterial/archaeal RNA diversity. This contrasts with studies on the human genome, where transcripts were sequenced prior to the release of the human genome over two decades ago. We developed software for the quick, easy, and reproducible prediction of bacterial and archaeal transcripts from Oxford Nanopore Technologies direct RNA sequencing data. These predictions are urgently needed for more accurate studies examining bacterial/archaeal gene regulation, including regulation of virulence factors, and for the development of novel RNA-based therapeutics and diagnostics to combat bacterial pathogens, like those with extreme antimicrobial resistance.
-
Parkhill, Julian (Ed.)
ABSTRACT Diarrheagenic Escherichia coli, collectively known as DEC, is a leading cause of diarrhea, particularly in children in low- and middle-income countries. Diagnosing infections caused by different DEC pathotypes traditionally relies on the cultivation and identification of virulence genes, a resource-intensive and error-prone process. Here, we compared culture-based DEC identification with shotgun metagenomic sequencing of whole stool using 35 randomly drawn samples from a cohort of diarrhea-afflicted patients. Metagenomic sequencing detected the cultured isolates in 97% of samples, revealing, overall, reliable detection by this approach. Genome binning yielded high-quality E. coli metagenome-assembled genomes (MAGs) for 13 samples, and we observed that the MAG did not carry the diagnostic DEC virulence genes of the corresponding isolate in 60% of these samples. Specifically, two distinct scenarios were observed: diffusely adherent E. coli (DAEC) isolates without corresponding DAEC MAGs appeared to be relatively rare members of the microbiome, which was further corroborated by quantitative PCR (qPCR), and thus unlikely to represent the etiological agent in 3 of the 13 samples (~23%). In contrast, ETEC virulence genes were located on plasmids and largely escaped binning in associated MAGs despite being prevalent in the sample (5/13 samples or ~38%), revealing limitations of the metagenomic approach. These results provide important insights for diagnosing DEC infections and demonstrate how metagenomic methods can complement isolation efforts and PCR for pathogen identification and population abundance.
IMPORTANCE Diagnosing enteric infections based on traditional methods involving isolation and PCR can be erroneous due to isolation and other biases, e.g., the most abundant pathogen may not be recovered on isolation media. By employing shotgun metagenomics together with traditional methods on the same stool samples, we show that mixed infections caused by multiple pathogens are much more frequent than traditional methods indicate in the case of acute diarrhea. Further, in at least 8.5% of the total samples examined, the metagenomic approach reliably identified a different pathogen than the traditional approach. Therefore, our results provide a methodology to complement existing methods for enteric infection diagnostics with cutting-edge, culture-independent metagenomic techniques, and highlight the strengths and limitations of each approach.
-
Wilson, Daniel; Parkhill, Julian (Ed.)
ABSTRACT A goal of modern biology is to develop the genotype-phenotype (G→P) map, a predictive understanding of how genomic information generates trait variation that forms the basis of both natural and managed communities. As microbiome research advances, however, it has become clear that many of these traits are symbiotic extended phenotypes, being governed by genetic variation encoded not only by the host’s own genome, but also by the genomes of myriad cryptic symbionts. Building a reliable G→P map therefore requires accounting for the multitude of interacting genes and even genomes involved in symbiosis. Here, we use naturally occurring genetic variation in 191 strains of the model microbial symbiont Sinorhizobium meliloti paired with two genotypes of the host Medicago truncatula in four genome-wide association studies (GWAS) to determine the genomic architecture of a key symbiotic extended phenotype, partner quality (the fitness benefit conferred to a host by a particular symbiont genotype), within and across environmental contexts and host genotypes. We define three novel categories of loci in rhizobium genomes that must be accounted for if we want to build a reliable G→P map of partner quality; namely, (i) loci whose identities depend on the environment, (ii) those that depend on the host genotype with which rhizobia interact, and (iii) universal loci that are likely important in all or most environments.
IMPORTANCE Given the rapid rise of research on how microbiomes can be harnessed to improve host health, understanding the contribution of microbial genetic variation to host phenotypic variation is pressing, and will better enable us to predict the evolution of (and select more precisely for) symbiotic extended phenotypes that impact host health. We uncover extensive context-dependency in both the identity and functions of symbiont loci that control host growth, which makes predicting the genes and pathways important for determining symbiotic outcomes under different conditions more challenging. Despite this context-dependency, we also resolve a core set of universal loci that are likely important in all or most environments and thus serve as excellent targets both for genetic engineering and future coevolutionary studies of symbiosis.
